117 research outputs found

    Presentation in Free-Form Space: Managing Ambiguity with Hypermedia Pathways While Supporting Ideation

    Traditional slideware presentation tools (e.g. PowerPoint) suffer from the problem of premature formalism, which interferes with how authors develop new knowledge. Free-form spatial content organization can overcome this problem by allowing users to express multiple, emerging relationships among content elements. Although its ambiguity fosters interpretation of relationships for both authors and audiences, that same ambiguity makes presentations more challenging to perform. Therefore, we integrated hypermedia pathways with a free-form space to support presentations. We conducted a field study with 158 users to understand authors’ experiences of creating content in free-form space, integrated with hypermedia pathways for presentation. Our findings show that this integration supports users not only in developing new ideas, but also in performing the presentations.

    An Information-Theoretic Framework for Evaluating Edge Bundling Visualization

    Edge bundling is a promising graph visualization approach to simplifying the visual result of a graph drawing. Plenty of edge bundling methods have been developed to generate diverse graph layouts. However, it is difficult to defend an edge bundling method with its resulting layout against other edge bundling methods as a clear theoretic evaluation framework is absent in the literature. In this paper, we propose an information-theoretic framework to evaluate the visual results of edge bundling techniques. We first illustrate the advantage of edge bundling visualizations for large graphs, and pinpoint the ambiguity resulting from drawing results. Second, we define and quantify the amount of information delivered by edge bundling visualization from the underlying network using information theory. Third, we propose a new algorithm to evaluate the resulting layouts of edge bundling using the amount of the mutual information between a raw network dataset and its edge bundling visualization. Comparison examples based on the proposed framework between different edge bundling techniques are presented

    Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge

    In this paper, we describe the systems developed by the SJTU X-LANCE team for the LIMMITS 2023 Challenge, and we mainly focus on the winning system on naturalness for track 1. The aim of this challenge is to build a multi-speaker multi-lingual text-to-speech (TTS) system for Marathi, Hindi and Telugu. Each of the languages has a male and a female speaker in the given dataset. In track 1, only 5 hours of data from each speaker can be selected to train the TTS model. Our system is based on the recently proposed VQTTS, which uses VQ acoustic features rather than mel-spectrograms. We introduce additional speaker embeddings and language embeddings to VQTTS for controlling the speaker and language information. In the cross-lingual evaluations where we need to synthesize speech in a cross-lingual speaker's voice, we provide a native speaker's embedding to the acoustic model and the target speaker's embedding to the vocoder. In the subjective MOS listening test on naturalness, our system achieves 4.77, which ranks first. Comment: Accepted by ICASSP 2023 Special Session for Grand Challenge

    Matrix GARCH Model: Inference and Application

    Matrix-variate time series data are largely available in applications. However, no attempt has been made to study their conditional heteroskedasticity that is often observed in economic and financial data. To address this gap, we propose a novel matrix generalized autoregressive conditional heteroskedasticity (GARCH) model to capture the dynamics of conditional row and column covariance matrices of matrix time series. The key innovation of the matrix GARCH model is the use of a univariate GARCH specification for the trace of conditional row or column covariance matrix, which allows for the identification of conditional row and column covariance matrices. Moreover, we introduce a quasi maximum likelihood estimator (QMLE) for model estimation and develop a portmanteau test for model diagnostic checking. Simulation studies are conducted to assess the finite-sample performance of the QMLE and portmanteau test. To handle large dimensional matrix time series, we also propose a matrix factor GARCH model. Finally, we demonstrate the superiority of the matrix GARCH and matrix factor GARCH models over existing multivariate GARCH-type models in volatility forecasting and portfolio allocations using three applications on credit default swap prices, global stock sector indices, and future prices

    A Novel LiDAR-Based Instrument for High-Throughput, 3D Measurement of Morphological Traits in Maize and Sorghum

    Recently, image-based approaches have developed rapidly for high-throughput plant phenotyping (HTPP). Imaging reduces a 3D plant into 2D images, which makes the retrieval of plant morphological traits challenging. We developed a novel LiDAR-based phenotyping instrument to generate 3D point clouds of single plants. The instrument combined a LiDAR scanner with a precision rotation stage on which an individual plant was placed. A LabVIEW program was developed to control the scanning and rotation motion, synchronize the measurements from both devices, and capture a 360° view point cloud. A data processing pipeline was developed for noise removal, voxelization, triangulation, and plant leaf surface reconstruction. Once the leaf digital surfaces were reconstructed, plant morphological traits, including individual and total leaf area, leaf inclination angle, and leaf angular distribution, were derived. The system was tested with maize and sorghum plants. The results showed that leaf area measurements by the instrument were highly correlated with the reference methods (R² > 0.91 for individual leaf area; R² > 0.95 for total leaf area of each plant). Leaf angular distributions of the two species were also derived. This instrument could fill a critical technological gap for indoor HTPP of plant morphological traits in 3D.

    Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

    Self-supervised learning (SSL) proficiency in speech-related tasks has driven research into utilizing discrete tokens for speech tasks like recognition and translation, which offer lower storage requirements and great potential to employ natural language processing techniques. However, these studies, mainly single-task focused, faced challenges like overfitting and performance degradation in speech recognition tasks, often at the cost of sacrificing performance in multi-task scenarios. This study presents a comprehensive comparison and optimization of discrete tokens generated by various leading SSL models in speech recognition and synthesis tasks. We aim to explore the universality of speech discrete tokens across multiple speech tasks. Experimental results demonstrate that discrete tokens achieve comparable results against systems trained on FBank features in speech recognition tasks and outperform mel-spectrogram features in speech synthesis in subjective and objective metrics. These findings suggest that universal discrete tokens have enormous potential in various speech-related tasks. Our work is open-source and publicly available to facilitate research in this direction

    PhenoImage: An open-source graphical user interface for plant image analysis

    High-throughput genotyping coupled with molecular breeding approaches has dramatically accelerated crop improvement programs. More recently, improved plant phenotyping methods have led to a shift from manual measurements to automated platforms with increased scalability and resolution. Considerable effort has also gone into developing large-scale downstream processing of the imaging datasets derived from high-throughput phenotyping (HTP) platforms. However, most available tools require some programming skills. We developed PhenoImage, an open-source, graphical user interface (GUI) based, cross-platform solution for HTP image processing, intended to make image analysis accessible to users with little or no programming skills. The open-source nature provides the possibility to extend its usability to meet user-specific requirements. The availability of multiple functions and filtering parameters provides flexibility to analyze images from a wide variety of plant species and platforms. PhenoImage can be run on a personal computer as well as on high-performance computing clusters. To test the efficacy of the application, we analyzed red, green, and blue (RGB) color intensity images and plant pigmentation-based fluorescence shoot images, both derived from the LemnaTec Imaging system, for two plant species differing in their physical attributes: sorghum [Sorghum bicolor (L.) Moench] and wheat (Triticum aestivum L.). In the study, we discuss the development, implementation, and working of PhenoImage.

    Effects of Coronal Magnetic Field Configuration on Particle Acceleration and Release during the Ground Level Enhancement Events in Solar Cycle 24

    Ground level enhancements (GLEs) are extreme solar energetic particle (SEP) events that are of particular importance in space weather. In solar cycle 24, two GLEs were recorded on 2012 May 17 (GLE 71) and 2017 September 10 (GLE 72), respectively, by a range of advanced modern instruments. Here we conduct a comparative analysis of the two events by focusing on the effects of large-scale magnetic field configuration near active regions on particle acceleration and release. Although the active regions were both located near the western limb, temporal variations of SEP intensities and energy spectra measured in situ display different behaviors at early stages. Using a potential field model, we find that the CME in GLE 71 originated below the streamer belt, while the CME in GLE 72 originated near the edge of the streamer belt. We reconstruct the CME shock fronts with an ellipsoid model based on nearly simultaneous coronagraph images from multiple viewpoints, and further derive the 3D shock geometry at the GLE onset. The highest-energy particles are primarily accelerated in the shock-streamer interaction regions, i.e., likely at the nose of the shock in GLE 71 and the eastern flank in GLE 72, due to quasi-perpendicular shock geometry and confinement of closed fields. Subsequently, they are released to the field lines connecting to near-Earth spacecraft when the shocks move through the streamer cusp region. This suggests that magnetic structures in the corona, especially shock-streamer interactions, may have played an important role in the acceleration and release of the highest-energy particles in the two events. Comment: Accepted for publication in Ap

    UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

    The utilization of discrete speech tokens, divided into semantic tokens and acoustic tokens, has been proven superior to traditional acoustic feature mel-spectrograms in terms of naturalness and robustness for text-to-speech (TTS) synthesis. Recent popular models, such as VALL-E and SPEAR-TTS, allow zero-shot speaker adaptation through auto-regressive (AR) continuation of acoustic tokens extracted from a short speech prompt. However, these AR models are restricted to generate speech only in a left-to-right direction, making them unsuitable for speech editing where both preceding and following contexts are provided. Furthermore, these models rely on acoustic tokens, which have audio quality limitations imposed by the performance of audio codec models. In this study, we propose a unified context-aware TTS framework called UniCATS, which is capable of both speech continuation and editing. UniCATS comprises two components, an acoustic model CTX-txt2vec and a vocoder CTX-vec2wav. CTX-txt2vec employs contextual VQ-diffusion to predict semantic tokens from the input text, enabling it to incorporate the semantic context and maintain seamless concatenation with the surrounding context. Following that, CTX-vec2wav utilizes contextual vocoding to convert these semantic tokens into waveforms, taking into consideration the acoustic context. Our experimental results demonstrate that CTX-vec2wav outperforms HifiGAN and AudioLM in terms of speech resynthesis from semantic tokens. Moreover, we show that UniCATS achieves state-of-the-art performance in both speech continuation and editing

    PI‑Plat: a high‑resolution image‑based 3D reconstruction method to estimate growth dynamics of rice inflorescence traits

    Background: Recent advances in image-based plant phenotyping have improved our capability to study vegetative stage growth dynamics. However, more complex agronomic traits such as inflorescence architecture (IA), which predominantly contributes to grain crop yield are more challenging to quantify and hence are relatively less explored. Previous efforts to estimate inflorescence-related traits using image-based phenotyping have been limited to destructive end-point measurements. Development of non-destructive inflorescence phenotyping platforms could accelerate the discovery of the phenotypic variation with respect to inflorescence dynamics and mapping of the underlying genes regulating critical yield components. Results: The major objective of this study is to evaluate post-fertilization development and growth dynamics of inflorescence at high spatial and temporal resolution in rice. For this, we developed the Panicle Imaging Platform (PI-Plat) to comprehend multi-dimensional features of IA in a non-destructive manner. We used 11 rice genotypes to capture multi-view images of primary panicle on weekly basis after the fertilization. These images were used to reconstruct a 3D point cloud of the panicle, which enabled us to extract digital traits such as voxel count and color intensity. We found that the voxel count of developing panicles is positively correlated with seed number and weight at maturity. The voxel count from developing panicles projected overall volumes that increased during the grain filling phase, wherein quantification of color intensity estimated the rate of panicle maturation. Our 3D based phenotyping solution showed superior performance compared to conventional 2D based approaches. Conclusions: For harnessing the potential of the existing genetic resources, we need a comprehensive understanding of the genotype-to-phenotype relationship. Relatively low-cost sequencing platforms have facilitated high-throughput genotyping, while phenotyping, especially for complex traits, has posed major challenges for crop improvement. PI-Plat offers a low cost and high-resolution platform to phenotype inflorescence-related traits using 3D reconstruction-based approach. Further, the non-destructive nature of the platform facilitates analyses of the same panicle at multiple developmental time points, which can be utilized to explore the genetic variation for dynamic inflorescence traits in cereals.